save information parser site scraper xml development .net recovery data java database j2ee c++ custom web crawler analysis